Abstract—Statistical methods based on the Armitage-Doll mathematical model of the carcinogenic
process are presented for analyzing epidemiologic case-control studies of cancer. These methods 
are proposed to provide inferences regarding the stage(s) in the cancer process at which the 
exposure of interest acts. An example of these methods is given which shows evidence that 
carcinogens in cigarette smoke appear to affect the transition rates for two separate stages in the 
development of lung cancer, and the relative magnitudes of these effects are estimated. The data 
for this analysis came from a European multi-center case-control study of lung cancer. 

The results of the analysis show that: (1) the relative risk of lung cancer among continuing 
smokers compared to nonsmokers of the same age decreases as the age started smoking increases, 
while the rate of smoking stays fixed, a result which indicates a carcinogenic effect on an early stage 
in the process; and (2) the relative risk among ex-smokers compared to continuing smokers having 
the same duration and rate of smoking decreases with time since smoking stopped, a result which 
indicates a carcinogenic effect on a late stage in the process. Both results are shown to be best
described by the hypothesis that cigarette smoking affects two stages. The estimated relative
magnitudes of smoking’s carcinogenic effects on the two stages indicate that the largest proportion
of the total lifetime lung cancer risk among continuing smokers is due to its late stage effect, and
that the proportion of risk due to causes other than smoking varies from 23% among those smoking
1-10 cigarettes per day to 6% among those smoking more than 30 cigarettes per day. These
findings imply that preventive measures directed toward inducing smokers to stop would have a
potentially substantial payoff in reducing future lung cancer mortality.


1. INTRODUCTION

The results of epidemiologic studies of human cancer are becoming more frequently
interpreted in terms of quantitative multistage theories of the carcinogenic process.
Mathematical models derived from these theories were originally proposed to explain the
observation that age-specific cancer incidence and mortality rates for many human cancers
of epithelial origin increase with the fifth or sixth power of age (Muller [1],
Nordling [2]). The age pattern of occurrence of many tumors induced in experimental
animals by continuous exposure to chemical carcinogens is also predicted by these
multistage theories. Experimental regimens involving initiation and promotion provide
direct evidence of the effect of early and late stage events in the carcinogenic process
(Stenback et al. [3-5]). In more recent years, the models have been used to elucidate the
meaning of the relationships observed between cancer risk and age at exposure, duration
of exposure, and time since exposure ended (Whittemore [6], Day and Brown [7], Brown and
Chu [8]). In this paper we
shall show how particular time patterns of 
cancer risk may be used to infer the stage in 
the cancer process, early or late, at which the 
carcinogen exerts its influence. To demonstrate 
the type of analyses we propose, we present an 
example of the relationship between cigarette 
smoking and lung cancer for which we shall 
estimate the relative contributions of smoking 
upon early and late stages in the development of 
lung cancer. We shall limit our presentation to 
inferences derived from the mathematical model 
of carcinogenesis proposed by Armitage and 
Doll [9]. 

2. EXAMPLE OF CIGARETTE SMOKING
AND LUNG CANCER 

For a number of years cigarette smoking has been known to be associated with an increased
risk of developing lung cancer [10]. Epidemiologic studies have shown that the risk of
lung cancer among continuing smokers increases at a high power of the duration of
smoking. These studies have also shown that once an individual stops smoking, his excess
lung cancer risk ceases to increase and remains nearly frozen at the level attained when
he last smoked. Doll [11], among others, has interpreted these findings as implying that
smoking affects two stages, one early and one late, in a presumed multistage carcinogenic
process. The observation that the absolute excess age-specific lung cancer risk among
smokers increases with the same power of duration of smoking as does the risk with age
among nonsmokers indicates that smoking likely affects the first stage of the process. On
the other hand, the rapidity of the effect of stopping smoking points to smoking’s effect
at a late stage. In addition, Doll and Peto [12] have determined that the dose-response
relationship between dose rate measured in cigarettes per day and response measured as
excess lung cancer incidence rate appears to be quadratic, which they interpreted to
imply that smoking affects two stages.

Though there is general agreement that smoking likely affects both an early and a late
stage in the process of lung cancer development, no one has quantified the relative
magnitude of smoking’s effects upon the two stages. The magnitude of these effects could
have substantial importance in the planning and study of intervention methods to reduce
lung cancer incidence. If smoking is primarily an early stage carcinogen, then the early
smoking years will be the primary cause of future lung cancer and intervention activities
directed at teenagers who have not yet started to smoke or who have been smoking for a
short while would be expected to have the greatest payoff. However, if smoking primarily
affects a late stage, then an intervention designed to get people to stop smoking at all
ages will definitely have an effect on their future risk. The remainder of this section
describes our methods for estimating these relative effects of smoking on early and late
stages in lung cancer development.

The data we use are from a hospital-based case-control interview study of individuals
newly diagnosed with lung cancer in five Western European countries. Patients admitted to
a study hospital with histologically confirmed lung cancer during 1976-80 were entered
into the study. Two age-, sex- and centre-matched controls were selected for each
patient. The controls were hospital patients in whom a non-tobacco related disease had
been diagnosed. No surrogate interviews were employed. Additional details of the study
design are described by Lubin et al. [13]. That part of the study which we shall analyze
here consists of 6920 male patients and 13,460 male controls. Female patients and
controls comprise only 11% of the total and thus we have simplified the analysis by
limiting ourselves to the results for males. We classify as smokers only those
individuals who have ever smoked cigarettes.

Our method of analysis is to examine the time and age patterns of the relative risk of
lung cancer to quantify the effects of cigarette smoking upon presumed early and late
stages in the cancer process. Following Day and Brown [7] and Doll and Peto [12] we
assume that cigarette smoking increases both the rates of first stage and penultimate
stage cellular events. Our analysis is divided into two separate comparisons:
(1) examination of the manner in which relative risk varies with age started smoking
among continuing smokers; and (2) examination of the manner in which relative risk varies
with time since stopped smoking among ex-smokers. We have performed these two separate
analyses because of the apparently conflicting epidemiologic evidence: the evidence among
continuing smokers compared to nonsmokers points to smoking affecting an early stage,
while the evidence from ex-smokers compared to continuing smokers points to smoking’s
effect on a late stage. We shall show that the two separate analyses are consistent with
the hypothesis that cigarette smoking affects two stages.

To estimate these effects of cigarette smoking on the first and penultimate stages in the
development of lung cancer, we use the relative risk models given in the Appendix which
are derived from the results in Day and Brown [7]. Our analysis consists of two steps.
The first step is to use unconditional logistic regression to estimate the relative risks
for the different categories of the time/age variables of interest. The second step is to
fit the relative risk models in Appendix equations (A1) and (A2) to these relative risk
estimates to derive estimates for the parameters $r_1$ and $r_{k-1}$. The relative
magnitude of these parameters indicates the degree to which each stage of the cancer
process is affected by the carcinogenic exposure.

We use two separate logistic regression models to estimate the relative risks of
interest. The first logistic analysis estimates the effect of age started smoking upon
the relative risk of lung cancer among continuing smokers compared to nonsmokers.
Ex-smokers were not used in this analysis. The model for this comparison contained
categorical variables for the matching variables study area (seven categories) and age at
interview (10 categories from age ≤46 through age ≥73), and for the factors number of
cigarettes smoked per day (four categories, 1-10 as the referent category, 11-20, 21-30
and ≥31), frequency of inhalation (all the time vs less as the referent category),
percent of total duration smoking nonfiltered cigarettes (all the time vs less as the
referent category), and the variable of primary interest, age started smoking (eight
categories: ≤14, 15, 16, 17, 18, 19-20 and ≥21, with nonsmokers as the referent
category). These particular age at interview and age started smoking categories were
selected to produce, as nearly as possible, equal numbers of individuals in each
category. The estimated logistic regression coefficients for the variables age started
smoking, number of cigarettes smoked per day, frequency of inhalation and percent of time
smoking nonfiltered cigarettes are given in Table 1. As expected, the relative risk
increases with the number smoked per day; men who smoke at least 1½ packs per day have
3 times the risk of men who smoke ½ pack or less each day. Except for those who started
smoking at age 14 or younger, the relative risk steadily declines with increasing age
started smoking. We examined without success the data for other variables which might
explain our finding a lower relative risk than expected for those who started smoking at
age 14 or younger; this age group had a distribution of other risk factors similar to
that of the other groups. The decrease in relative risk with increasing age started
exposure, adjusted for age at diagnosis and amount smoked per day, implies that smoking
affects an early stage in the carcinogenic process, since Appendix equation (A1) with
$r_1 = 0$ shows that the relative risk would be nearly independent of age started smoking
if smoking affected only a late stage. However, the moderate magnitude of this decrease
is not consistent with the hypothesis that smoking affects only the first stage, for if
this were the case the decrease would be much more pronounced. This is shown graphically
in Fig. 1 which compares the observed pattern of relative risk with patterns predicted by
three particular hypotheses. The predicted patterns are obtained from least squares fits
of the logarithm of equation (A1) to the estimated log relative risks shown in Table 1.
The three models assumed (a) only a late stage effect, $r_1 = 0$, (b) only an early stage
effect, $r_{k-1} = 0$, and (c) an effect on both stages. Different values of k and t were
tried, with k = 5 and t = 65 giving the best overall fits. This figure shows that the
hypothesis of two affected stages clearly fits the observed pattern substantially better
than either of the separate single stage hypotheses.


Table 1. Estimated logistic regression coefficients in continuing cigarette
smokers compared to nonsmokers

Variable                                        Estimated coefficient   Relative risk

Age started smoking
  Nonsmoker                                                             1.0
  ≤14                                           1.288 (±0.101)          3.6
  15                                            1.419 (±0.106)          4.1
  16                                            1.390 (±0.106)          4.0
  17                                            1.384 (±0.110)          4.0
  18                                            1.294 (±0.097)          3.6
  19-20                                         1.225 (±0.094)          3.4
  ≥21                                           1.077 (±0.094)          2.9

Number cigarettes smoked per day
  1-10                                                                  1.0
  11-20                                         0.712 (±0.059)          2.0
  21-30                                         1.035 (±0.070)          2.8
  ≥31                                           1.184 (±0.074)          3.3

Frequency of inhalation
  All the time                                  0.117 (±0.046)          1.1

Percent of time smoked nonfiltered cigarettes
  All the time                                  0.179 (±0.049)          1.2


Source: https://www.industrydocuments.ucsf.edu/docs/hlnj0001

Charles C. Brown and Kenneth C. Chu


[Figure omitted; horizontal axis: age started smoking]

Fig. 1. Relative risks of lung cancer in smokers vs nonsmokers of the same age estimated
from logistic regression model: observed pattern compared to patterns predicted by
multistage model of carcinogenesis.
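
The contrast among the three hypotheses can be checked numerically. The sketch below is illustrative only: it assumes the relative risk for continuing smokers is the sum of the four rate components in Appendix equation (A3) divided by the background rate, with k = 5 and t = 65 as quoted in the text. The function name is hypothetical, and the single-stage parameter values are arbitrary choices for illustration, not fitted estimates (the two-stage pair is the Table 3 entry for 1-10 cigarettes per day).

```python
def rr_continuing(t0, r1, rk1, k=5, t=65):
    """Relative risk, continuing smoker (started at age t0) vs nonsmoker of age t.

    Assumed form: 1 + [r1*(1 + rk1)*(t - t0)**(k-1) + rk1*(t**(k-1) - t0**(k-1))] / t**(k-1)
    """
    p = k - 1
    excess = r1 * (1 + rk1) * (t - t0) ** p + rk1 * (t ** p - t0 ** p)
    return 1 + excess / t ** p


# (r1, rk1) pairs; the single-stage values are hypothetical, chosen only to show the shape.
hypotheses = {
    "late stage only": (0.0, 2.8),
    "early stage only": (3.0, 0.0),
    "both stages": (0.7, 2.8),
}

for name, (r1, rk1) in hypotheses.items():
    ratio = rr_continuing(14, r1, rk1) / rr_continuing(21, r1, rk1)
    print(f"{name}: RR(start at 14) / RR(start at 21) = {ratio:.3f}")
```

Under these assumptions the late-stage-only ratio stays very close to 1, which is the sense in which the relative risk is "nearly independent" of age started smoking when $r_1 = 0$.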

A separate logistic model estimates the relationship between relative risk and the time
since smoking stopped using continuing smokers as the referent group. Brown and Chu [8]
showed that duration of exposure must be controlled for to make valid use of the relative
risk pattern as an indication of which carcinogenic stage is affected by exposure.
Therefore, we restrict our analysis only to individuals who have ever been cigarette
smokers. It must be kept in mind that individuals who give up the smoking habit do so for
a variety of reasons, some related to health. In a preliminary analysis we found that the
lung cancer risk relative to continuing smokers was greater than unity for men who had
recently quit smoking. The relative risk was a maximum for those who had quit within one
year of the interview. Those individuals who had quit smoking were asked in the
questionnaire to identify the reason for quitting as being health related or not.
Figure 2 shows the relationship between the years since quitting and the proportion of
men who answered that they quit for health reasons. This figure clearly shows that men
who had recently quit did so more often for health reasons than men who had quit a longer
time in the past. Since we felt this association might bias the relationship between
relative risk and time since stopped smoking, we decided to include those who quit within
the past year with the continuing smokers, eliminate from the analysis the 420 men who
quit between one and two years before the interview, and identify as ex-smokers only
those men who had quit for more than two years. In addition, we included the reason for
quitting as an adjustment variable in our logistic regression. The regression model also
contained categorical variables for study area, age at interview, number of cigarettes
smoked, duration of smoking (nine categories in five year groups beginning with ≤10 years
as the referent group), frequency of inhalation, percent of time smoking nonfiltered
cigarettes, and years since stopped smoking (10 categories from continuing smokers as the
referent group to 3, 4, 5-6, 7-8, 9-11, 12-15, 16-20, 21-26 and ≥27 years since stopped
smoking). The factor age started smoking, shown to be important in the previous analysis,
was not included here because of its multicollinearity with the other factors, current
age, duration of smoking, and time since quit smoking. As for the other logistic
analysis, these particular years since smoking stopped categories were selected to
produce near equal numbers of individuals in each. Table 2 gives the logistic regression
coefficient estimates for duration of smoking, number of cigarettes smoked per day,
frequency of inhalation, percent of time smoking nonfiltered cigarettes, reason for
stopping smoking, and time since stopped smoking. The relative risk of lung cancer
increases with duration of smoking as predicted by the multistage theory, and decreases
with increasing time since smoking stopped. In addition, the risk ratio for an individual
who stopped for health reasons relative to an individual who stated that he stopped for
reasons other than health was estimated to be 1.3 (p < 0.001). As in the previous
analysis the relative risk increased with the number of cigarettes smoked per day. The
relative risk estimates are 1.0, 1.9, 2.4 and 2.8 which are nearly equal to those shown
in Table 1.



[Figure omitted; horizontal axis: years since stopped smoking]


Fig. 2. Proportion of ex-smokers who stopped for reasons 
related to health. 

Table 2. Estimated logistic regression coefficients in ex-smokers compared to
continuing smokers

Variable                                        Estimated coefficient   Relative risk

Years since stopped smoking
  0-1                                                                   1.0
  3                                             -0.014 (±0.098)         0.99
  4                                             -0.248 (±0.106)         0.78
  5-6                                           -0.346 (±0.090)         0.71
  7-8                                           -0.375 (±0.102)         0.69
  9-11                                          -0.724 (±0.095)         0.48
  12-15                                         -0.751 (±0.107)         0.47
  16-20                                         -0.939 (±0.116)         0.39
  21-26                                         -0.828 (±0.120)         0.44
  ≥27                                           -0.907 (±0.137)         0.40

Years of smoking
  ≤10                                                                   1.0
  11-15                                         -0.001 (±0.224)         1.0
  16-20                                          0.527 (±0.190)         1.7
  21-25                                          0.750 (±0.180)         2.1
  26-30                                          0.954 (±0.175)         2.6
  31-35                                          1.083 (±0.173)         3.0
  36-40                                          1.235 (±0.173)         3.4
  41-45                                          1.338 (±0.174)         3.8
  ≥46                                            1.512 (±0.173)         4.5

Number of cigarettes smoked per day
  1-10                                                                  1.0
  11-20                                          0.642 (±0.049)         1.9
  21-30                                          0.878 (±0.057)         2.4
  ≥31                                            1.032 (±0.060)         2.8

Frequency of inhalation
  All the time                                   0.127 (±0.039)         1.1

Percent of time smoked nonfiltered cigarettes
  All the time                                   0.101 (±0.040)         1.1

Reason for stopping smoking
  Health-related                                 0.241 (±0.064)         1.3

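
The relative risks in Table 2 are the exponentials of the logistic regression coefficients. The following sketch reproduces that conversion for the years-since-stopped categories and adds an approximate 95% interval. Two things here are assumptions rather than statements from the paper: the value in parentheses is treated as a standard error, and the interval is a simple Wald interval. The function name is hypothetical.

```python
import math

# (category, coefficient, value reported in parentheses, reported relative risk).
# The parenthesised value is ASSUMED to be a standard error.
YEARS_SINCE_STOPPED = [
    ("3", -0.014, 0.098, 0.99),
    ("4", -0.248, 0.106, 0.78),
    ("5-6", -0.346, 0.090, 0.71),
    ("7-8", -0.375, 0.102, 0.69),
    ("9-11", -0.724, 0.095, 0.48),
    ("12-15", -0.751, 0.107, 0.47),
    ("16-20", -0.939, 0.116, 0.39),
    ("21-26", -0.828, 0.120, 0.44),
    (">=27", -0.907, 0.137, 0.40),
]


def relative_risk(coef, se, z=1.96):
    """Return (RR, lower, upper) for a log-scale coefficient and its assumed standard error."""
    return math.exp(coef), math.exp(coef - z * se), math.exp(coef + z * se)


for label, coef, se, reported in YEARS_SINCE_STOPPED:
    rr, lower, upper = relative_risk(coef, se)
    print(f"{label:>6} years: RR = {rr:.2f} (approx. 95% interval {lower:.2f}-{upper:.2f}), "
          f"reported {reported}")
```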


The risk of lung cancer among ex-smokers relative to that among continuing smokers of the
same age and smoking rate and duration decreases with time since quitting, and appears to
level out and possibly even begin to increase after about 20 years. This is displayed
graphically in Fig. 3 where the observed pattern is compared to patterns predicted by
different hypotheses concerning the stage(s) affected by smoking. These predicted
patterns are from three least squares fits of the logarithm of Appendix equation (A2) to
the log relative risk estimates in Table 2. The hypothesis that only the first stage is
affected by smoking is clearly rejected by these data since the relative risk would be an
increasing function of time since exposure stopped and the observed pattern is clearly
decreasing (the “best” fitting increasing function is nearly flat as shown in the
figure). These data are consistent with the hypothesis that only the penultimate stage is
affected by the carcinogenic exposure. However, the single affected penultimate stage
hypothesis does not predict the upturn in relative risk which is suggested by these data.
A hypothesis that assumes smoking affects both the first and penultimate stages provides
a better description, in terms of a smaller sum of squares, of the observed pattern since
it does predict an upturn in relative risk. This delay period before the upturn can be
thought of as representing the time required for an initiated cell, for which the first
cellular change was caused by its exposure to carcinogens in cigarette smoke, to pass
spontaneously through the remaining stages, including the penultimate, in the
carcinogenic process.


Fig. 3. Relative risks of lung cancer in ex-smokers vs continuing smokers of the same age
and same smoking duration estimated from logistic regression model: observed pattern
compared to patterns predicted by multistage model of carcinogenesis.


Table 3. Estimated relative increase in cellular event rates for multistage
model applied to relative risks of lung cancer among cigarette smokers

                         Relative increase caused by smoking
Number cigarettes
smoked per day           First stage ($r_1$)      Penultimate stage ($r_{k-1}$)

1-10                     0.7                      2.8
11-20                    2.5                      5.0
21-30                    3.5                      6.3
≥31                      4.0                      7.0

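
The qualitative behaviour described above (increasing under a first-stage-only effect, decreasing under a penultimate-stage-only effect, and a decrease followed by an upturn when both stages are affected) can be illustrated with a short sketch. It assumes the ex-smoker relative risk is built from the same Armitage-Doll rate components as Appendix equation (A3), with exposure ending f years before the current age t, and uses k = 5, t = 65 and d = 25 as quoted in the text. The function name and the single-stage parameter values are hypothetical.

```python
def rr_ex_smoker(f, r1, rk1, k=5, t=65, d=25):
    """Relative risk, ex-smoker who quit f years ago vs continuing smoker.

    Both are of age t and smoked for d years (assumed multistage form).
    """
    p = k - 1
    background = t ** p
    both = r1 * rk1 * d ** p
    ex = (background
          + r1 * ((d + f) ** p - f ** p)
          + rk1 * ((t - f) ** p - (t - d - f) ** p)
          + both)
    continuing = (background
                  + r1 * d ** p
                  + rk1 * (t ** p - (t - d) ** p)
                  + both)
    return ex / continuing


for f in (0, 3, 10, 20, 30, 40):
    first_only = rr_ex_smoker(f, 2.0, 0.0)   # hypothetical r1, no late effect
    late_only = rr_ex_smoker(f, 0.0, 2.8)    # hypothetical rk1, no early effect
    both = rr_ex_smoker(f, 0.7, 2.8)         # Table 3 values for 1-10 per day
    print(f"f = {f:2d}: first only {first_only:.2f}, late only {late_only:.2f}, both {both:.2f}")
```

With both stages affected, the curve under these assumptions falls for roughly three decades and then turns upward, which is the feature the single penultimate stage hypothesis cannot produce.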

Both analyses, the relationship of lung cancer relative risk to age started exposure to
cigarettes, and the relationship of risk to time since exposure stopped, provide
consistent evidence for the two affected stage hypothesis. To quantify the magnitude of
smoking’s effect on the two stages, the logarithms of the relative risk functions in
Appendix equations (A1) and (A2) were simultaneously fit to the log relative risk
estimates derived from the two sets of logistic regression parameter estimates given in
Tables 1 and 2. This was done for each category of number of cigarettes smoked daily to
estimate $r_1$ and $r_{k-1}$ as a function of number of cigarettes smoked per day. The
relative risks for age started smoking given in Table 1 relate to the referent category
of 1-10 cigarettes smoked per day. Since the logistic regression model implicitly assumes
a multiplicative relation between age started smoking and number of cigarettes smoked per
day, the set of relative risk estimates for individuals who smoke more than 1-10 per day
is given by the product of the appropriate relative risks (e.g. the relative risk
estimate for an individual who started smoking at age 16 and smoked 21-30 cigarettes per
day is 4.0 × 2.8 = 11.2). Since the relative risk estimates derived from the age started
smoking coefficients in Table 1 relate to the referent categories of the inhalation and
nonfiltered cigarette factors, we adjusted the log relative risks to reflect the actual
distribution of continuing smokers who inhaled all the time (69%) and who smoked only
nonfiltered cigarettes (27%). This resulted in adding 0.13 to each log relative risk.
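
The arithmetic in this paragraph can be retraced in a few lines. The sketch below shows one way to arrive at an adjustment of about 0.13, namely weighting the Table 1 inhalation and nonfiltered coefficients by the stated proportions of continuing smokers; that weighting is an assumption about how the figure was obtained, and the variable and function names are hypothetical.

```python
import math

# Table 1 coefficients for the two adjustment factors.
COEF_INHALE_ALL = 0.117
COEF_NONFILTER_ALL = 0.179

# Proportions of continuing smokers quoted in the text.
SHARE_INHALE_ALL = 0.69
SHARE_NONFILTER_ALL = 0.27

# ASSUMED construction of the adjustment: proportion-weighted sum of coefficients.
adjustment = SHARE_INHALE_ALL * COEF_INHALE_ALL + SHARE_NONFILTER_ALL * COEF_NONFILTER_ALL
print(f"adjustment to each log relative risk: {adjustment:.3f}")


def combined_rr(coef_age_started, coef_cigarettes):
    """Multiplicative model: relative risks multiply, so coefficients add."""
    return math.exp(coef_age_started + coef_cigarettes)


# Started at age 16 (1.390) and smokes 21-30 per day (1.035), before adjustment.
print(f"from coefficients: {combined_rr(1.390, 1.035):.1f}")
print(f"from rounded relative risks: {4.0 * 2.8:.1f}")
```

The small gap between the two printed products comes from multiplying rounded relative risks rather than adding unrounded coefficients.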

A nonlinear least squares procedure was used to estimate the unknown relative risk
parameters, $r_1$ and $r_{k-1}$. The estimated values of $r_1$ and $r_{k-1}$ for each of
the four fits are given in Table 3. These values are based on k = 5, t = 65 and d = 25
for each of the four least squares fits. We found these values to produce the smallest
overall sum of squares when totalled over the four fits. These estimates in Table 3 imply
that cigarette smoking produces a larger relative increase on a late stage event in the
carcinogenic process than on the event which initiates it. For an individual who smokes
21-30 cigarettes per day, this carcinogenic exposure is estimated to increase the rate of
the early cellular event by 350% over background and the rate of the late cellular event
by 630%.
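
A joint fit of this kind can be sketched as follows for the 1-10 cigarettes per day category. This is not the authors' procedure: the paper does not name its nonlinear least squares routine, so a plain grid search is swapped in here, and the single age or year standing in for each category (for example 23 for "≥21") is an assumption made purely for illustration. The relative risk forms are the assumed multistage expressions with k = 5, t = 65 and d = 25, and the result should not be expected to reproduce Table 3 exactly.

```python
import math

K, T, D = 5, 65, 25  # k, t and d as quoted in the text
P = K - 1


def log_rr_continuing(t0, r1, rk1):
    excess = r1 * (1 + rk1) * (T - t0) ** P + rk1 * (T ** P - t0 ** P)
    return math.log(1 + excess / T ** P)


def log_rr_ex_smoker(f, r1, rk1):
    both = r1 * rk1 * D ** P
    ex = (T ** P + r1 * ((D + f) ** P - f ** P)
          + rk1 * ((T - f) ** P - (T - D - f) ** P) + both)
    continuing = T ** P + r1 * D ** P + rk1 * (T ** P - (T - D) ** P) + both
    return math.log(ex / continuing)


# (representative value, coefficient). Coefficients are from Tables 1 and 2;
# the representative ages and years for each category are ASSUMED.
AGE_STARTED = [(14, 1.288), (15, 1.419), (16, 1.390), (17, 1.384),
               (18, 1.294), (19.5, 1.225), (23, 1.077)]
YEARS_STOPPED = [(3, -0.014), (4, -0.248), (5.5, -0.346), (7.5, -0.375), (10, -0.724),
                 (13.5, -0.751), (18, -0.939), (23.5, -0.828), (30, -0.907)]
ADJUSTMENT = 0.13  # added to each Table 1 log relative risk, as described in the text


def sum_of_squares(r1, rk1):
    ss = sum((c + ADJUSTMENT - log_rr_continuing(t0, r1, rk1)) ** 2 for t0, c in AGE_STARTED)
    ss += sum((c - log_rr_ex_smoker(f, r1, rk1)) ** 2 for f, c in YEARS_STOPPED)
    return ss


def grid_fit(step=0.1, upper=10.0):
    """Exhaustive search over a grid of (r1, rk1); returns (sum of squares, r1, rk1)."""
    n = int(round(upper / step))
    grid = [i * step for i in range(n + 1)]
    return min((sum_of_squares(a, b), a, b) for a in grid for b in grid)


best_ss, r1_hat, rk1_hat = grid_fit()
print(f"grid-search estimates: r1 = {r1_hat:.1f}, r(k-1) = {rk1_hat:.1f}, "
      f"sum of squares = {best_ss:.3f}")
```

A grid search is slow but transparent; a derivative-based optimizer would be the more usual choice once the model form is settled.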

Taken by themselves, these estimated relative increases in cellular event rates are
difficult to interpret. However, in the Appendix we present an approach to measure their
contributions to the lifetime risk of lung cancer. This method divides this risk into
four pieces: (1) the background risk due to causes other than smoking; the excess risk
due to smoking causing (2) only the first cellular event, (3) only the penultimate
cellular event, and (4) both the first and penultimate events. The results are presented
in Table 4. These calculations assume $t_0 = 17$ which was the median age started smoking
for the subjects in this study. As expected, this table shows an inverse relationship
between the number of cigarettes smoked per day and the contribution of background to
total lifetime risk, ranging from 23% for “light” smokers (1-10 per day) to only 6% for
“heavy” smokers (≥31 per day). The magnitudes of these percentages agree closely with
estimates of 85-90% for the amount of lung cancer estimated by others to be attributable
to cigarette smoking [15]. In addition, the largest proportion of the total risk, on the
order of 50%, is estimated to be from smoking’s effect on only the penultimate stage
while only 5% of the total risk is due to smoking’s effect on only the first stage. As
the exposure rate increases, the relative contribution of two cellular events being
caused by smoking increases. The proportion of lifetime risk due to smoking’s effect on
both early and late stage cellular events increases from 10% for “light” smokers to 43%
for “heavy” smokers.


Table 4. Relative lifetime* risk of lung cancer among continuing smokers who begin at age 17

                                 Lifetime risk contribution due to smoking-induced effect
Number cigarettes                Only first    Only penultimate    Both first and
smoked per day      Background   stage         stage               penultimate stages

1-10                0.23         0.04          0.63                0.10
11-20               0.11         0.06          0.53                0.30
21-30               0.08         0.06          0.47                0.39
≥31                 0.06         0.06          0.45                0.43

*Through age 85.

3. DISCUSSION 

The logistic regression models used in our analysis did not involve any interactive
effects. We searched for the occurrence of such interactions which would produce
different relative risk patterns than observed, but we did not find any. The relative
risk patterns shown in Figs 1 and 3 are not changed to any substantial degree when we
limit the analysis to include only individuals who smoked a particular number of
cigarettes per day (e.g. 1-20 or ≥21) or to those cases and controls who were at
particular ages at interview (e.g. <55 or >65). We also examined the pattern of relative
risk with time since stopped smoking for individuals with different smoking durations
(e.g. 21-30 or 31-40 years) and found similar results. Thus the general patterns of
relative risks seen in Figs 1 and 3 are consistently found across various strata of
individuals and there appear to be no interactions with other factors. Other smoking
characteristics were also examined to see if they should be included in the logistic
regression models. However, we found that inhalation frequency and percent of time
smoking nonfiltered cigarettes were the only statistically significant factors.

We wish to emphasize the importance of controlling duration of exposure when studying the
pattern of relative risk as a function of time since exposure ended. Doll and Peto [12]
found duration of smoking to be the single most important factor in the evolution of lung
cancer risk. Therefore, an association of duration with other factors being studied could
produce a substantial bias. In this study, the correlation between the duration of
smoking and the time since smoking stopped for ex-smokers is -0.6, indicating that the
men who had stopped smoking for many years had smoked for less time than those men who
had stopped for a shorter time. Therefore, without adjusting for duration of exposure, a
decreasing trend in relative risk with increasing time since exposure stopped would not
necessarily indicate a late stage carcinogenic effect since the observed decrease might
have been caused by the associated decrease in exposure duration. As an example of the
potential for bias, we re-estimated the relative risks for the years-since-smoking-stopped
categories without adjusting for smoking duration. Whereas the decreasing trend of the
relative risks adjusted for duration shown in Table 2 flattened out as the years since
smoking stopped increased, the decreasing trend continued sharply down when no adjustment
was made for duration of smoking. This is shown in Fig. 4. For the category “≥27 years
since stopped smoking”, the relative risk estimate is 0.40 when duration adjustment is
made (Table 2) and is 0.17 when no adjustment is made. In this situation the bias
produced by smoking duration is substantial and could lead to different conclusions.


[Figure omitted; horizontal axis: years since quit smoking]

Fig. 4. Relative risks of lung cancer in ex-smokers compared to continuing smokers as a
function of time since stopped smoking estimated from logistic regression model: pattern
adjusted for smoking duration compared to pattern unadjusted for duration.

The primary purpose of our analysis is to estimate the relative effects of cigarette
smoking upon early and late events in the development of lung cancer. The results in
Tables 3 and 4 indicate that smoking has a substantially greater effect on late stage
events than it does on initiating events. This implies that preventive measures directed
toward inducing smokers to stop should have a potentially substantial payoff in reducing
the future burden of lung cancer mortality. However, this does not imply that one should
ignore interventions designed to stop people from ever starting to smoke: since the
duration of smoking is the single most important factor in the evolution of lung cancer,
the future lung cancer mortality in the population will be best reduced by reducing the
number of individuals who ever begin to smoke.

APPENDIX

Assuming that cigarette smoking affects both the first and penultimate stages in a
multistage process of lung cancer development, relative risk models follow directly from
Day and Brown [7, equation (4)]. We let t denote the individual’s current age, $t_0$ the
age at which smoking began, d the duration of smoking, and f the follow-up time since
smoking stopped. As a function of age started smoking $t_0$, the relative risk of lung
cancer in continuing smokers vs nonsmokers of the same age t is,

$$RR_c(t_0) = \frac{t^{k-1} + r_1 (t - t_0)^{k-1} + r_{k-1} \left( t^{k-1} - t_0^{k-1} \right) + r_1 r_{k-1} (t - t_0)^{k-1}}{t^{k-1}} \qquad \text{(A1)}$$

where $r_1$ and $r_{k-1}$ denote the relative increases in the first and penultimate
cellular events produced by exposure to cigarette smoke. By relative increase we mean the
following: let u denote the cellular event rate in the absence of smoking; let u + v
denote the rate when exposed to cigarette smoke; then r = (v/u) represents the relative
increase in the rate due to cigarette smoking. In a similar derivation, as a function of
the time elapsed since smoking stopped f, the relative risk of lung cancer in ex-smokers
vs continuing smokers of the same age t and the same duration of smoking d is given by,

$$RR_e(f) = \frac{t^{k-1} + r_1 \left[ (d + f)^{k-1} - f^{k-1} \right] + r_{k-1} \left[ (t - f)^{k-1} - (t - d - f)^{k-1} \right] + r_1 r_{k-1} d^{k-1}}{t^{k-1} + r_1 d^{k-1} + r_{k-1} \left[ t^{k-1} - (t - d)^{k-1} \right] + r_1 r_{k-1} d^{k-1}} \qquad \text{(A2)}$$


For an individual exposed continuously beginning at age $t_0$, without stopping, to a
carcinogen which affects both the first and penultimate stages in the multistage process,
the Armitage-Doll model of carcinogenesis leads to the following age-specific cancer
incidence rates at age t,

$$\lambda_0(t) = \theta t^{k-1},$$
$$\lambda_1(t) = \theta r_1 (t - t_0)^{k-1}, \quad t \ge t_0,$$
$$\lambda_{k-1}(t) = \theta r_{k-1} \left( t^{k-1} - t_0^{k-1} \right), \quad t \ge t_0,$$
$$\lambda_{1,k-1}(t) = \theta r_1 r_{k-1} (t - t_0)^{k-1}, \quad t \ge t_0, \qquad \text{(A3)}$$

where $\theta$ is a scale factor.

The function $\lambda_0(t)$ represents the spontaneous, or background rate;
$\lambda_1(t)$ represents the excess rate due to a carcinogenic effect only on the first
stage (e.g. the excess rate produced by smoking causing the first cellular event but not
the penultimate event); $\lambda_{k-1}(t)$ represents the excess rate due to a
carcinogenic effect only on the penultimate stage; and $\lambda_{1,k-1}(t)$ represents
the excess rate due to carcinogenic effects on both the first and penultimate stages. The
total age-specific risk is the sum of the background and the excess,

$$\lambda(t) = \lambda_0(t) + \lambda_1(t) + \lambda_{k-1}(t) + \lambda_{1,k-1}(t). \qquad \text{(A4)}$$

The lifetime risk can be estimated from competing risk considerations as,

$$L = \int_0^{\infty} \lambda(t) S(t) \, dt, \qquad \text{(A5)}$$

where S(t) is the survival probability that an individual is alive at age t. This
lifetime risk L can be factored into four pieces by replacing $\lambda(t)$ in equation
(A5) by its component parts given in equation (A3). Since our case-control study does not
provide an estimate of the scale factor $\theta$, we cannot use the results of this
case-control study to estimate the absolute magnitude of smoking’s effect on the lifetime
risk of lung cancer. However, we may estimate the relative contributions of the four
components to the lifetime risk by dividing each of the four by their total (the unknown
scale factor cancels out). These relative lifetime risks are,

$$L_0 = \int_0^{\infty} \lambda_0(t) S(t) \, dt \, / \, L,$$
$$L_1 = \int_0^{\infty} \lambda_1(t) S(t) \, dt \, / \, L,$$
$$L_{k-1} = \int_0^{\infty} \lambda_{k-1}(t) S(t) \, dt \, / \, L,$$
$$L_{1,k-1} = \int_0^{\infty} \lambda_{1,k-1}(t) S(t) \, dt \, / \, L. \qquad \text{(A6)}$$

Thus, for example, $L_1$ represents that proportion of the lifetime risk of lung cancer
produced by smoking causing only the first stage of the carcinogenic process (the other
stages being caused by “background”). We estimate these relative lifetime risks in
equation (A6) by approximating the integrals by summations over single years of age up to
age 85 where S(t) represents U.S. national life table survival probabilities for white
males [14] and the parameter estimates $\hat{r}_1$ and $\hat{r}_{k-1}$ are from Table 3.
For this example we assume that the estimates $\hat{r}_1$ and $\hat{r}_{k-1}$, based on a
European study, are applicable to U.S. males and we use the total U.S. male survival
rates to approximate the survival rates for male smokers. When computing lifetime cancer
risks, this approximation will produce overestimates since smokers do not live as long as
nonsmokers. However, our computation of proportions of lifetime cancer risk should not be
as biased by this use of total U.S. male survival rates.
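
The decomposition of lifetime risk into four shares can be sketched with a summation over single years of age through 85. The rate components below follow the assumed Armitage-Doll forms with k = 5 and $t_0 = 17$, and the parameter pairs are those in Table 3. The survival curve is a HYPOTHETICAL Gompertz-type stand-in, not the life table cited as [14], so the printed shares are expected to be close to, but not identical with, Table 4. All function and constant names are illustrative.

```python
import math

K, T0, MAX_AGE = 5, 17, 85
P = K - 1


def survival(t, a=1e-4, b=0.085):
    """HYPOTHETICAL Gompertz-type survival curve standing in for the life table in [14]."""
    return math.exp(-(a / b) * (math.exp(b * t) - 1.0))


def lifetime_shares(r1, rk1):
    """Shares of lifetime risk: [background, first only, penultimate only, both]."""
    totals = [0.0, 0.0, 0.0, 0.0]
    for t in range(1, MAX_AGE + 1):
        s = survival(t)
        exposed = t >= T0
        rates = (
            t ** P,
            r1 * (t - T0) ** P if exposed else 0.0,
            rk1 * (t ** P - T0 ** P) if exposed else 0.0,
            r1 * rk1 * (t - T0) ** P if exposed else 0.0,
        )
        for i, rate in enumerate(rates):
            totals[i] += rate * s  # the scale factor cancels, so it is omitted
    grand_total = sum(totals)
    return [x / grand_total for x in totals]


TABLE_3 = {"1-10": (0.7, 2.8), "11-20": (2.5, 5.0), "21-30": (3.5, 6.3), ">=31": (4.0, 7.0)}

for label, (r1, rk1) in TABLE_3.items():
    shares = lifetime_shares(r1, rk1)
    print(f"{label:>6} per day: " + "  ".join(f"{x:.2f}" for x in shares))
```

Because only proportions are computed, the result depends on the shape of the survival curve rather than on its absolute level, which is the reason the text gives for expecting limited bias from using total male survival rates.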